We consider the problem of choosing the best of n samples, out of a large random pool, when the 
sampling of each member is associated with a certain cost. The quality (worth) of the best sample 
clearly increases with n, but so do the sampling costs, and one important question is how many 
to sample for optimal gain (worth minus costs). If, in addition, the assessment of worth for each 
sample is associated with some “measurement error,” the perceived best out of n might not be the 
actual best, complicating the issue. Situations like this are typical in mate selection, job hiring, 
and food foraging, to name just a few. We tackle the problem by standard order statistics, yielding 
suggestions for optimal strategies, as well as some unexpected insights. 

PACS numbers: 2.50.Le 


As a motivating example, consider the problem of the academic hiring committee when conducting a candidate 
search. A large number of candidates apply, and, after filtering only to the highly qualified candidates, their application 
records provide little insight into the multitude of issues that would determine which of these candidates is the “best” 
for the job, which is why we invite several of those candidates for a campus interview. This process requires money, 
time, and effort, so (of course) we don’t invite all candidates. But how many should we invite? Given that the 
evaluation process (the interview) does not provide perfect information about the eventual success of a candidate, 
how much good are we gaining by the interview? If our first candidate does very well, should we make an offer, or 
should we wait to sample more from the pool? If our initial slate of candidates was “just Okay,” what should we 
expect to gain by asking the Dean to let us invite more candidates? The general difficulty is that one would like to 
choose “the best,” but if the candidates aren’t very different, or if our ability to distinguish “the best” is not very 
good, then we may be wasting our resources. 


BACKGROUND 

We consider the problem of maximizing gain, on choosing an item from a large population pool. First, imagine 
that we are presented with n items, randomly selected from some population, where we would like to choose the item 
with the greatest worth, as measured by the value of some attribute which we denote as A. We may treat the value 
of this attribute as a random variable, with distribution determined by the underlying population distribution, and 
denote the attribute value for the ith item as $A_i$. Then 

$$A_{\max}(n) = \max_{i=1,\ldots,n} A_i$$

is also a random variable, and the standard tools from order statistics may be applied to find the probability 
distribution for $A_{\max}$ in terms of the cumulative distribution function F(a) of that attribute for the population. Imagine 
further that each measurement (an assessment of the value of an item) carries some cost, monetary or 
otherwise, so that the total cost of the measurements is $C_n$. Then, the total gain to be gotten from the process is 
$g(n) = A_{\max}(n) - C_n$. One important goal is to find the n which maximizes g(n): how many people should one interview 
before hiring, how many mates should one date before proposing, how many cars should one test-drive before buying, etc. 
This problem is treated in Section x. 

A complication arises when the evaluation process of the worth of each item is imperfect, yielding a somewhat 
erroneous value. In that case, the perceived “best” item out of n might not coincide with the actual best, and that 
diminishes the expected gain. The precise effect of noisy measurement, and how to work out an optimal strategy 
despite it, is treated in Section y. The ubiquitous case where the items’ worth and the error in measurement are 
each normally distributed is particularly enlightening, yielding some simple closed-form formulas, and we use it to 
demonstrate the general procedure. 

Some further insights are developed in Section z, where we show that it always pays to sample three items, if it 
pays to sample at all, when the worth distribution and error distribution are both normal. We conclude and discuss 
our findings in Section w. 
EXPECTED GAIN WITH IDEAL MEASUREMENT 


Order Statistics and Worth 


We begin with the ideal case that the value of each item is assessed perfectly, without any measurement error. 
Consider then a sample of n i.i.d. random variables $X_i$ taken from the distribution $p(x)$, the probability density 
function for the worth of our items, which are reordered according to their ascending worth: $X_{(1)}, X_{(2)}, \ldots, X_{(n)}$. 
Standard order statistics gives us the cumulative distribution function (cdf) for $X_{(k)}$: 

$$\Psi_{X_{(k)}}(x) = P(X_{(k)} < x) = \sum_{j=k}^{n} \binom{n}{j} P(x)^j \bigl(1 - P(x)\bigr)^{n-j}, \tag{1}$$

where $P(x) = \int_{-\infty}^{x} p(x')\,dx'$ is the cdf of the items’ worth. Focusing on the largest item selected, we have 


$$\Psi_{X_{(n)}}(x) = P(X_{(n)} < x) = P(x)^n, \qquad \psi_{X_{(n)}}(x) = n P(x)^{n-1} p(x), \tag{2}$$

where the probability density function (pdf) $\psi_{X_{(n)}}(x)$ was computed by differentiation. A quick, alternative way to 
obtain this last result is by realizing that P(x) denotes the probability that any one of the $X_i$ is smaller than x. Then, 
for the maximal value to be x, we need one of the $X_i$ to equal x, say $X_m = x$, while $X_j < x$ for $j \neq m$. This happens 
with probability density $n P(x)^{n-1} p(x)$, since m can be chosen in n different ways. The expected value for this maximal order 
statistic, which we denote as $K_n$, is computed as 

$$K_n := E[X_{(n)}] = n \int_{-\infty}^{\infty} x\, P(x)^{n-1} p(x)\, dx. \tag{3}$$


A simple variable transformation shows that the analogous result for the rescaled density $p'(x) = a\,p(ax + b)$ is $K'_n = \frac{1}{a}(K_n - b)$. 

For the flat distribution, p(x) = 1 for 0 < x < 1 (and zero otherwise), for example, one obtains $K_n = n/(n+1)$. In 
general, however, no closed form solution exists for $K_n$, but numerical approximations for some distributions can be 
found in most texts on order statistics and are available in statistical software packages. For example, in the special 
case of standard normal variables, these expectations are called rankits, and their values are required to make Q-Q plots. 

For large n, a simple, useful approximation for $K_n$, due to Van der Waerden, is given by $\int_{-\infty}^{K_n} p(x)\,dx \approx n/(n+1)$. 
(It does give the exact result for the flat distribution of the example.) For the normal distribution, $\phi(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$, 
this approximation yields $K_n \sim \sqrt{2 \ln n}$. The very slow increase of $K_n$ with n is quite typical, with the exception of 
fat-tailed distributions: for $p(x) = \alpha x^{-1-\alpha}$, $x > 1$ (and zero elsewhere), for example, $K_n \sim n^{1/\alpha}$, which increases 
rapidly for small values of $\alpha$. 
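Eq. (3) is straightforward to evaluate numerically. The following is a minimal sketch, assuming composite Simpson integration on a truncated range; the function names (`expected_max`, `kappa`, `van_der_waerden`) and the truncation limits are our own choices for illustration and do not come from the text. It evaluates $K_n$ for the standard normal and for the flat distribution, and prints Van der Waerden's approximation alongside for comparison.

```python
import math
from statistics import NormalDist


def phi(x):
    """Standard normal pdf."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def expected_max(n, pdf, cdf, lo, hi, steps=4000):
    """K_n from Eq. (3): n * integral of x * P(x)^(n-1) * p(x) dx,
    by composite Simpson's rule on [lo, hi] (steps must be even)."""
    h = (hi - lo) / steps

    def integrand(x):
        return n * x * cdf(x) ** (n - 1) * pdf(x)

    total = integrand(lo) + integrand(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(lo + i * h)
    return total * h / 3.0


def kappa(n):
    """Expected maximum of n standard normal variables.
    The range [-10, 10] is an assumed truncation of the infinite integral."""
    return expected_max(n, phi, Phi, -10.0, 10.0)


def van_der_waerden(n):
    """Van der Waerden's approximation: solve Phi(K) = n / (n + 1) for K."""
    return NormalDist().inv_cdf(n / (n + 1))


if __name__ == "__main__":
    for n in (2, 5, 10, 100):
        flat = expected_max(n, lambda x: 1.0, lambda x: x, 0.0, 1.0)
        print(n, round(kappa(n), 4), round(van_der_waerden(n), 4), round(flat, 4))
```

For the flat distribution the numerical value can be compared directly with the closed form n/(n+1); for the normal distribution the exact two-sample value is $1/\sqrt{\pi}$.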


Costs, Gains, and Optimization 

The value of $K_n$ is an increasing function of n, so if the goal is to “get the very best,” the strategy is simply to 
sample as many as possible. In practical situations, however, there is invariably a cost associated with the sampling and 
measuring process: Bringing in candidates for interviews costs money and time; in the animal kingdom, courting many 
potential partners costs energy and time, delaying an eventual union and diminishing the chances for reproduction; 
or searching for the larger fruits exposes a forager to increasing danger, the longer the search, etc. As a decision 
problem, the choice of n should be based on what gives the most net benefit, or gain. 

Denote the cost of measuring the ith item by $c_i$, with cumulative cost 

$$C_n = \sum_{i=1}^{n} c_i. \tag{4}$$

The optimal sample size n* would then be given by the optimization problem 

$$n^* = \operatorname*{argmax}_{n}\, (K_n - C_n).$$

In general, a reasonable assumption might be that the total cost is proportional to the number of samples, with fixed 
cost c per item, i.e., $c_i = c$ and $C_n = nc$. We shall proceed under this assumption. Note, however, that for the case 
of n = 1, when only one item is picked, there is no point in measurement, since by necessity that one item is the best 
available. Hence, we must also stipulate that $C_1 = 0$ (rather than c). 

The marginal worth for sampling item n, given by 

$$k_n := K_n - K_{n-1},$$

is usually a decreasing function of n, as illustrated by Figure 1, for the normal distribution. The optimal sample size 
n* is then chosen as the largest n such that the marginal worth exceeds the marginal cost: 

$$k_{n^*} > c > k_{n^*+1}, \qquad n^* \geq 2. \tag{5}$$

If $k_2 < c$, the best strategy is to pick one item at random and keep it, without bothering to measure, as already 
discussed above. 
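The optimization can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the authors' code: worth is taken to be normal with standard deviation a, costs follow the convention $C_1 = 0$ and $C_n = nc$ for n ≥ 2, and the search is a brute-force argmax over 1 ≤ n ≤ `n_max` (a hypothetical cutoff) rather than the marginal criterion (5). The names `kappa` and `optimal_sample_size` are ours.

```python
import math


def kappa(n, lo=-10.0, hi=10.0, steps=4000):
    """Expected maximum of n standard normals, Eq. (3), via Simpson's rule."""
    h = (hi - lo) / steps

    def f(x):
        cdf = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
        pdf = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
        return n * x * cdf ** (n - 1) * pdf

    total = f(lo) + f(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0


def optimal_sample_size(a, c, n_max=100):
    """Return (n*, gain) maximizing a*kappa_n - C_n over 1 <= n <= n_max,
    with C_1 = 0 and C_n = n*c for n >= 2.
    The gain is measured relative to the mean worth of the population."""
    best_n, best_gain = 1, 0.0  # n = 1: pick at random, no measurement cost
    for n in range(2, n_max + 1):
        gain = a * kappa(n) - n * c
        if gain > best_gain:
            best_n, best_gain = n, gain
    return best_n, best_gain


if __name__ == "__main__":
    for cost in (0.5, 0.05, 0.005):
        print(cost, optimal_sample_size(1.0, cost))
```

Because the gain is compared against the no-measurement baseline of zero, a sufficiently large cost returns n* = 1, which matches the "pick one at random" case discussed above.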




FIG. 1: Standardized Gain. Plots of $\kappa_n$, the expected maximal statistic for the standard normal (Left), and a log-scale plot 
of the marginal $\kappa_n - \kappa_{n-1}$ (Right) show that after a “few” samples, the gain grows very slowly with respect to the number of items 
examined. 

As a simple example, consider p(x) = 1/a for 0 < x < a (and zero elsewhere), for which $k_n = \frac{a}{n(n+1)}$. If, 
furthermore, $a \gg c$, then $n^* \approx \sqrt{a/c}$. For the normal distribution, $\phi_a(x) = \frac{1}{\sqrt{2\pi}\,a} e^{-x^2/2a^2}$, we get from Van der 
Waerden’s approximation $k_n \approx \frac{a}{n\sqrt{2 \ln n}}$ (for $n \gg 1$), so $n^* \sim a/c$, if $a \gg c$. Finally, for the freak case of a fat-tailed 
distribution, such as $p(x) = \alpha x^{-1-\alpha}$, $x > 1$ (and zero elsewhere), $k_n \sim n^{(1/\alpha)-1}$, so for $\alpha < 1$ the gain increases 
indefinitely with n, regardless of the mounting costs. 
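For the flat distribution, the estimate of n* follows from criterion (5) in one step. The algebra below is our own filling-in of that step, written out for convenience:

```latex
k_n = \frac{a}{n(n+1)} > c
\;\Longleftrightarrow\;
n^2 + n - \frac{a}{c} < 0
\;\Longleftrightarrow\;
n < \frac{-1 + \sqrt{1 + 4a/c}}{2},
\qquad\text{so}\qquad
n^* \approx \sqrt{a/c} \quad (a \gg c).
```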


THE EFFECT OF MEASUREMENT ERROR 

We now turn to the case when the measurement of each item is not perfect, but associated with some error. For 
simplicity and concreteness, throughout the remainder of the paper we focus on the most common scenario, where 
the worth of the items and the error made in each measurement can both be described by the normal distribution. 
The general case can be treated in much the same way, but is less transparent, since it is then impossible to push the 
analytical calculations as far. 

To avoid any confusion, we denote the normal distribution of zero mean and standard deviation a by $\phi_a(x)$, instead of p(x). 
For a = 1, we simply use $\phi(x)$, dropping the subscript. Likewise, we denote the expected maximal statistics of $\phi$ by 
$\kappa_n$ (instead of $K_n$). Note that the expected maximal statistics for $\phi_a$ is then $a\kappa_n$. 

Assume then that the $A_i$’s are independent and normally distributed with mean $\mu$ and standard deviation a. We define the 
return for the ith item as 

$$X_i = A_i - \mu, \tag{6}$$

where $\mu$ is the worth mean. The $X_i$’s are then i.i.d. random variables, described by the normal pdf 
$\phi_a(x) = \frac{1}{\sqrt{2\pi}\,a} e^{-x^2/2a^2}$. Assume, further, that each measurement is associated with an error $Y_i$, and that the $Y_i$’s are i.i.d. 
random variables described by the normal pdf $\phi_b(y)$. Thus, the actual value measured for the ith item is 

$$W_i = X_i + Y_i. \tag{7}$$

From standard results, we observe that $W_i$ is normally distributed, with mean 0 and standard deviation $\sqrt{a^2 + b^2}$ (see eq. (10), 
below). 

Because the process is independent for each i, we (for the moment) drop the subscript in order to (notationally) 
ease the discussion in understanding the relationship between the measured value W and the actual return X. We 
first ask what is the expected return given a particular measured value? Because sample and measurement error are 
independent, the joint distribution for X and Y is given by 


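$$f_{XY}(x,y) = \phi_a(x)\,\phi_b(y) = \frac{1}{2\pi ab} \exp\!\left(-\frac{x^2}{2a^2} - \frac{y^2}{2b^2}\right). \tag{8}$$

We perform a change of variables, $Y = W - X$, to find the joint distribution of X and W, 

$$f_{XW}(x,w) = \phi_a(x)\,\phi_b(w-x) = \frac{1}{2\pi ab} \exp\!\left(-\frac{x^2}{2a^2} - \frac{(w-x)^2}{2b^2}\right), \tag{9}$$

remarking that the substitution is simplified by the fact that dY/dW = 1. The pdf for measured values W is then 

$$f_W(w) = \int_{-\infty}^{\infty} f_{XW}(x,w)\,dx = \frac{1}{\sqrt{2\pi(a^2+b^2)}} \exp\!\left(-\frac{w^2}{2(a^2+b^2)}\right) = \phi_{\sqrt{a^2+b^2}}(w), \tag{10}$$

while the pdf for a return x, conditioned on a measured value w, is then 

$$f_X(x \mid W = w) = \frac{f_{XW}(x,w)}{f_W(w)} = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{\bigl(x - \frac{a^2}{a^2+b^2}\,w\bigr)^2}{2\sigma^2}\right), \tag{11}$$

where $\sigma = ab/\sqrt{a^2+b^2}$. The required conditional expectation is then easily obtained: 

$$E[X \mid W = w] = \int_{-\infty}^{\infty} x\, f_X(x \mid W = w)\,dx = \frac{a^2}{a^2+b^2}\, w := \eta^2 w. \tag{12}$$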
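The linear shrinkage in the conditional expectation can be checked by simulation. The sketch below is our own illustration (the function name, sample size, and seed are arbitrary assumptions): because E[X | W = w] is linear in w, the least-squares slope of X on W estimates the coefficient $a^2/(a^2+b^2)$.

```python
import random


def regression_slope(a, b, samples=200_000, seed=0):
    """Least-squares slope of X on W = X + Y, for X ~ N(0, a^2), Y ~ N(0, b^2).
    Since E[X | W = w] is linear in w, this slope estimates a^2 / (a^2 + b^2)."""
    rng = random.Random(seed)
    sum_xw = 0.0
    sum_ww = 0.0
    for _ in range(samples):
        x = rng.gauss(0.0, a)
        w = x + rng.gauss(0.0, b)
        sum_xw += x * w
        sum_ww += w * w
    # Both variables have zero mean, so no centering is needed.
    return sum_xw / sum_ww


if __name__ == "__main__":
    a, b = 1.0, 0.5
    print("simulated slope:", regression_slope(a, b))
    print("a^2/(a^2+b^2): ", a * a / (a * a + b * b))
```

With a = 1 and b = 0.5 the predicted coefficient is 0.8; the simulated slope should agree to within sampling error.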


Armed with this result we can now complete our original goal, of determining the expected return on selecting the 
largest item based on measured values. Formally, we define 

$$Q = X_k,$$

where k satisfies 

$$W_k = W_{(n)},$$

the largest order statistic of the sample measured values. Stated more directly, Q represents the return of the item 
that measured to be the largest. We may compute the required expectation of Q by integrating (12) against $\psi_{W_{(n)}}$, 
the pdf for $W_{(n)}$ (obtained from (10) and (2)), to yield 

$$E[Q] = \int_{-\infty}^{\infty} E[X \mid W = w]\, \psi_{W_{(n)}}(w)\, dw = \frac{a^2}{a^2+b^2}\, E[W_{(n)}] = \frac{a^2}{a^2+b^2} \sqrt{a^2+b^2}\; \kappa_n = (a\kappa_n)\, \frac{a}{\sqrt{a^2+b^2}} = \eta\,(a\kappa_n). \tag{13}$$

Note that $a\kappa_n$ is the result one expects in the ideal case, when there is no measurement error. The net effect of 
measurement error, then, is to degrade the gain that could be obtained in the ideal case, by the factor $\eta = a/\sqrt{a^2+b^2}$. 
(This would not be the case for distributions other than normal, in general, but one expects qualitatively similar 
behavior.) For $b \ll a$, $\eta \approx 1 - \frac{b^2}{2a^2}$ and the degradation is minimal (and vanishing as $b \to 0$). For $b \gg a$, however, 
$\eta \approx a/b$ and the degradation is large. The latter case explains our “common sense” understanding of two common 
situations: 

• If there is not much difference between items (a small), don’t bother to measure, just pick one. 

• If you can’t tell the difference between items (b large), don’t bother to measure, just pick one. 
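The degradation factor can also be checked by direct simulation of the selection process. The following is a sketch under the paper's normality assumptions; the function names, trial count, and seed are our own illustrative choices. It draws n items, picks the one with the largest noisy measurement, and averages its true return, for comparison with the prediction $\eta a \kappa_n$.

```python
import math
import random


def kappa(n, lo=-10.0, hi=10.0, steps=4000):
    """Expected maximum of n standard normals, Eq. (3), via Simpson's rule."""
    h = (hi - lo) / steps

    def f(x):
        cdf = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
        pdf = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
        return n * x * cdf ** (n - 1) * pdf

    total = f(lo) + f(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0


def simulate_selected_return(n, a, b, trials=40_000, seed=1):
    """Average true return X of the item whose measured value W = X + Y is largest."""
    rng = random.Random(seed)
    total = 0.0
    for _trial in range(trials):
        best_w, best_x = -math.inf, 0.0
        for _item in range(n):
            x = rng.gauss(0.0, a)        # true return
            w = x + rng.gauss(0.0, b)    # noisy measurement
            if w > best_w:
                best_w, best_x = w, x
        total += best_x
    return total / trials


def predicted_return(n, a, b):
    """Prediction eta * a * kappa_n, with eta = a / sqrt(a^2 + b^2)."""
    eta = a / math.sqrt(a * a + b * b)
    return eta * a * kappa(n)


if __name__ == "__main__":
    n, a, b = 5, 1.0, 1.0
    print("simulated:", simulate_selected_return(n, a, b))
    print("predicted:", predicted_return(n, a, b))
```

Increasing b in the simulation shows the selected return shrinking toward zero, the "just pick one" regime described in the bullets.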


SO, HOW MANY SHOULD WE TRY? 

Well, we have already answered this question, formally, by providing a way to compute n*, the optimal number 
of trials. But often one’s search is less well planned, or the optimal strategy cannot be followed, due to external 
constraints (e.g., the funding for bringing interviewees on campus comes from your Dean). Here we develop two 
important strategies to help deal with such problems. The first strategy ignores pre-planning, and addresses the 
immediate question whether to sample once more, based on what we already have at hand. The second strategy 
establishes a reasonable minimum of tries when one is pressed to terminate the searching prematurely. 


Should we try one more? 


The analysis leading to the criterion of (5) addresses the question of how many items to sample, based on careful 
and deliberate a priori planning. In many instances, however, the sampling process is not pre-planned, but sequential 
(e.g., should I try on one more pair of jeans before making my purchase). In such instances, the decision whether to 
sample one more is based only on the current information — the measured value of the current best choice. 

Suppose that after some amount of sampling our best choice measures to be $w_0$. If we sample one additional item, 
with measured value W, then we would prefer the old sample if $W < w_0$, but switch to the new if $W > w_0$. Then the 
expected increase in worth, conditioned on W = w, would be given by 

$$h(w) = \begin{cases} 0 & \text{if } w < w_0, \\ \eta^2 (w - w_0) & \text{if } w > w_0, \end{cases} \tag{14}$$

where we have applied the result of (12). The unconditional expectation of gain on sampling one more, $V^+_{w_0}$, is then 
given by 

$$V^+_{w_0} = \int_{-\infty}^{\infty} h(w)\, f_W(w)\, dw = \int_{-\infty}^{\infty} h(w)\, \phi_{\sqrt{a^2+b^2}}(w)\, dw. \tag{15}$$

Carrying out the integrals, and expressing the final result in terms of the standard normal distribution (with unit 
variance), we obtain 

$$V^+_{w_0} = a\eta\,[\phi(z_0) + z_0(\Phi(z_0) - 1)] := a\eta\, v^+(z_0), \tag{16}$$

where $z_0 = w_0/\sqrt{a^2+b^2}$ and $\Phi$ is the standard normal cdf. As a general guideline, $v^+(z)$ is a rapidly decreasing function of z: $v^+(z) \sim |z|$ for $z \ll 0$, 
$v^+(0) = 1/\sqrt{2\pi}$, and $v^+(z) \sim e^{-z^2/2}$ for $z \gg 0$. The (sequential) decision whether to sample one more is based on 
whether $V^+_{w_0} > c$ (sample!) or not. In Figure xx, we plot the function $v^+(z)$ used for making this decision. 
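The sequential rule reduces to evaluating one closed-form function. The sketch below is our own illustration of that rule; the function names are hypothetical, and the measured value `w0` is assumed to be expressed relative to the population mean, as in the definition of the return.

```python
import math


def phi(z):
    """Standard normal pdf."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def Phi(z):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def v_plus(z):
    """Standardized expected gain: phi(z) + z * (Phi(z) - 1)."""
    return phi(z) + z * (Phi(z) - 1.0)


def expected_gain_one_more(w0, a, b):
    """Expected gain a * eta * v_plus(z0), with z0 = w0 / sqrt(a^2 + b^2)."""
    s = math.sqrt(a * a + b * b)
    eta = a / s
    return a * eta * v_plus(w0 / s)


def should_sample_one_more(w0, a, b, c):
    """Sample again only if the expected gain exceeds the per-item cost c."""
    return expected_gain_one_more(w0, a, b) > c


if __name__ == "__main__":
    for w0 in (-1.0, 0.0, 1.0, 2.0):
        print(w0, expected_gain_one_more(w0, 1.0, 0.5),
              should_sample_one_more(w0, 1.0, 0.5, 0.1))
```

The better the current best item measures, the smaller the expected gain from one more sample, so the rule eventually says to stop.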


Try at least three, or none! 

Under some circumstances, there is external pressure to limit the sampling to a small number of items, sometimes 
even when it is clear that a longer search would be more advantageous. For example, the funding for the search, 
such as in the case of hiring new faculty, might come from an external source (the Dean) , and one faces pressure to 
terminate the process as early as possible. We here answer the question “What is a reasonable minimum amount of 
tries?” relevant to such situations. 

We assume that the pdf of the items’ value is normal, with average $\mu$ and standard deviation a, and that the pdf of the error 
in measurement is also normal, with standard deviation b (and zero average). Assume furthermore that the cost of measuring 
each item is c. Then, if $\mu < c$, it pays to simply pick one item, at random, without measuring. The expected gain in 
that case is $g_1 = \mu$. 

Does it pay, instead, to try two items? According to our results for selecting with measuring errors, the expected 
maximal worth of two items is $a\eta\kappa_2 + \mu$, so that the expected gain is $g_2 = a\eta\kappa_2 + \mu - 2c$. Thus, it pays to try two 
items if $g_2 > g_1$, or $a\eta\kappa_2 - 2c > 0$. 

We shall now prove that $\kappa_3 = \frac{3}{2}\kappa_2$. In that case, the expected gain from trying three items at the outset is 
$g_3 = a\eta\kappa_3 + \mu - 3c = g_2 + (a\eta\kappa_2 - 2c)/2$. Thus, whenever it pays to try two items, it pays even more to try 
three! This suggests the following “minimalist” strategy: If you believe that the cost of measuring is too high for even 
a small number of items, then just pick one at random (and don’t bother to measure). Otherwise, try at least three. 

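Using the result (3), and exploiting the fact that $\phi(x)$ is an even function of x, while x and $\Phi(x) - \frac{1}{2}$ are odd, the 
proof is straightforward: 

$$\kappa_3 = 3\int_{-\infty}^{\infty} x\, \Phi(x)^2 \phi(x)\, dx = 3\int_{-\infty}^{\infty} x \left[\left(\Phi(x) - \tfrac{1}{2}\right)^2 + \Phi(x) - \tfrac{1}{4}\right] \phi(x)\, dx = 3\int_{-\infty}^{\infty} x\, \Phi(x)\,\phi(x)\, dx = \frac{3}{2}\kappa_2.$$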


Incidentally, the above proof also shows that $K_3 = \frac{3}{2}K_2$ for any pdf that is an even function of its argument, and the 
same symmetry trick can be used to obtain $K_{2n+1}$ in terms of $K_2, K_4, \ldots, K_{2n}$; for example, $K_5 = \frac{5}{2}(K_4 - K_2)$, etc. 
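The identity between the two- and three-sample expectations is easy to confirm numerically. The sketch below is our own check, assuming Simpson integration of Eq. (3); it tests the standard normal and, as a second symmetric example of our choosing, the uniform density on [-1, 1].

```python
import math


def expected_max(n, pdf, cdf, lo, hi, steps=4000):
    """K_n from Eq. (3) by composite Simpson's rule on [lo, hi]."""
    h = (hi - lo) / steps

    def integrand(x):
        return n * x * cdf(x) ** (n - 1) * pdf(x)

    total = integrand(lo) + integrand(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(lo + i * h)
    return total * h / 3.0


def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def ratio_k3_k2(pdf, cdf, lo, hi):
    """Ratio K_3 / K_2 for a given density; expected to be 3/2 when the pdf is even."""
    return expected_max(3, pdf, cdf, lo, hi) / expected_max(2, pdf, cdf, lo, hi)


if __name__ == "__main__":
    print("normal:         ", ratio_k3_k2(normal_pdf, normal_cdf, -10.0, 10.0))
    print("uniform [-1, 1]:", ratio_k3_k2(lambda x: 0.5, lambda x: 0.5 * (x + 1.0), -1.0, 1.0))
```

Trying an asymmetric density (for instance the flat distribution on [0, 1], where the ratio is 9/8) shows that the symmetry assumption is doing the work.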


DISCUSSION AND CONCLUSION 


As primary results from this paper, we briefly restate what we consider as the key analytic contributions: 


1. If we are measuring with error, and determine a particular measured value w, then the expected true value 
(accounting for stochastic differences in the population, not error in our measurement) is given by 

$$\int_{-\infty}^{\infty} x\, f_X(x \mid W = w)\, dx = \frac{a^2}{a^2+b^2}\, w := \eta^2 w. \tag{17}$$

2. If we intend to measure n items and select the item that measures as the best, the expected benefit of that 
process is given by 

$$V(n, a, b) := \frac{a^2 \kappa_n}{\sqrt{a^2+b^2}} = \frac{a \kappa_n}{\sqrt{1 + (b/a)^2}} = \eta a \kappa_n. \tag{18}$$

3. If we currently have an item which measures $w_0$, then the expected gain in worth, on picking one more item to 
measure, is given by 

$$V^+_{w_0} = a\eta\,[\phi(z_0) + z_0(\Phi(z_0) - 1)] := a\eta\, v^+(z_0), \tag{19}$$

where $z_0 = w_0/\sqrt{a^2+b^2}$. 

In the context of our original motivating example (the candidate search), we note that item (2) addresses the question 
of how many people the Dean might let us invite, but that decision would still require some means of determining 
costs of a candidate visit measured in the same units as the value of selecting a better candidate. Item (3) addresses 
the question of whether we should make an offer to our current “best candidate,” or should we wait to see another 
candidate. Item (1) relates directly to the issue of the importance of having a good measuring system (the interview 
process itself), where we would remark that b can be reduced through repeated measuring, equivalent to requiring 
the candidate to stay for a longer visit and conduct more interviews. However, it is worthwhile to note that the 
relationship between the math and our illustrative problem is mostly qualitative, in that our normality assumptions, 
as well as the idea that we know the mean and variance of both the population and our measuring device, are not 
reasonable. 

As a component of discussion, we think it is worthwhile to comment upon the implications of these results. We 
recall that the sampling process can be assumed to have costs, so decision theory principles lead us to the simple 
conclusion that we should only sample more items if the expected gain is greater than the cost of sampling. Consequently, 
our analytic formulas provide additional insight into the process. 

• The more we sample, the better should be our performance in selection, so long as we do not go beyond the point where 
sampling costs exceed expected benefits. 
• If our measurement system is not very accurate, we suffer two effects. On the one hand, we are less able to 
select the best item, but, additionally, our expected gain is reduced. For a fixed marginal cost to sample, that 
means we will stop sampling sooner, settling earlier in the process, further reducing our likelihood of finding an 
“exceptionally good” item. 

• As a corollary, if we want to find very good items, sampling costs must be very low. 

• As a second corollary, if we can reduce our measurement error, it can become cost effective to sample more items. 
As a numerical example, if the sampling cost was such that we would have looked at n = 10 items, our standardized 
expected gain is $\kappa_{10} \approx 1.54$. If the per-item sampling cost were reduced by a factor of 10, then based on the 
marginal benefit being greater than marginal cost, we would sample n = 70 items, with $\kappa_{70} \approx 2.38$. Based on 
the rapid decay, we would note that the benefit grows roughly like log n. 

If we examine these principles playing out in arenas such as mate selection, we would (perhaps) have to ignore the 
competitive aspect (your proposed mate must also choose you over other possible mate choices). However, one could 
use these results to infer that performance in the mate selection arena is enhanced if “dating” is cheap. Specifically, 
if we want to find a very good mate, then we must be willing to perform more sampling. Biologically, there is 
an inherent risk cost associated with moving from one mate choice to another. We note that there appears to have 
been evolutionary pressure in this direction [1], as it is a well-observed phenomenon that the body reacts (hormonally) 
to provide increased pleasure during the first stages of a relationship (the thrill of dating). Perhaps this pleasure 
boost should be viewed as decreasing the cost associated with sampling so that there is marginal reason to sample 
additional items before choosing a mate. 


* Electronic address: jskufca@clarkson.edu 
† Electronic address: qd00@clarkson.edu 

[1] David M Buss, The evolution of desire: Strategies of human mating, Basic Books, 2003. 



